7 Archaeobotany
The results presented here are preliminary and the chapter has yet to be written.
In this chapter, I will present the macrobotanical data from 190 case studies used to carry on this research (Chapter 3), along with the statistics performed on the data. The results will be first presented temporally, and a discussion of the diachronic trends will follow at the end of the chapter.
7.1 Case studies
The following map shows the sites under investigation, divided by chronology. Please select the desired chronology (or chronologies) from the legend at the top-right of the map.
R = Roman, LR = Late Roman, EMA = Early Middle Ages, Ma = 11th c. onwards7.2 Ubiquity
In Chapter 6, ubiquity has been described as the best solution to present the archaeobotanical remains from the Italian peninsula, given the numerous biases in the samples. The heatmap below (Figure 7.2) provides a good overview of the temporal trends of presence of cereals, legumes, fruits and nuts in the entire area under examination.
Show the code
# Load the libraries
# Note: these libraries are used for the data visualizations in this page.
library(RColorBrewer)
library(reshape2)
library(ggplot2)
library(hrbrthemes)
library(plotly)
library(patchwork)
## UBIQUITY
## Creating a dataframe that contains the ubiquity of each century under examination.
Ubiquity_table <- data.frame(
"I BCE" = archaeobotany_tables(plants_export, -1)$Ubiquity_exp,
"I CE" = archaeobotany_tables(plants_export, 1)$Ubiquity_exp,
"II CE" = archaeobotany_tables(plants_export, 2)$Ubiquity_exp,
"III CE" = archaeobotany_tables(plants_export, 3)$Ubiquity_exp,
"IV CE" = archaeobotany_tables(plants_export, 4)$Ubiquity_exp,
"V CE" = archaeobotany_tables(plants_export, 5)$Ubiquity_exp,
"VI CE" = archaeobotany_tables(plants_export, 6)$Ubiquity_exp,
"VII CE" = archaeobotany_tables(plants_export, 7)$Ubiquity_exp,
"VIII CE" = archaeobotany_tables(plants_export, 8)$Ubiquity_exp,
"IX CE" = archaeobotany_tables(plants_export, 9)$Ubiquity_exp,
"X CE" = archaeobotany_tables(plants_export, 10)$Ubiquity_exp,
"XI CE" = archaeobotany_tables(plants_export, 11)$Ubiquity_exp
)
# Transform the ubiquity table into a matrix
Ubiquity_mat <- as.matrix(Ubiquity_table)
# Rename the centuries
colnames(Ubiquity_mat) <- c("1st c. BCE", "1st c. CE", "2nd c. CE",
"3rd c. CE", "4th c. CE", "5th c. CE",
"6th c. CE", "7th c. CE", "8th c. CE",
"9th c. CE", "10th c. CE", "11th c. CE")
# The data has to be molten to use it with ggplot2
# (package: reshape2)
Ubiquity_melt <- melt(Ubiquity_mat)
# Let's now rename the columns
colnames(Ubiquity_melt) <- c("Taxon", "Century", "Ubiquity")
# Add a column for the text tooltip
Ubiquity_melt <- Ubiquity_melt %>%
mutate(text = paste0("Taxon: ", Taxon, "\n", "Century: ", Century, "\n", "Value: ",round(Ubiquity,2)))
# Create the heatmap with ggplot2
Ubiquity_HM <- ggplot(Ubiquity_melt, aes(Century, Taxon, fill=Ubiquity, text=text)) +
geom_tile(colour="white") +
scale_alpha(range=c(0,1)) +
scale_x_discrete("", expand = c(0, 0)) +
scale_y_discrete("", expand = c(0, 0)) +
theme_grey(base_size = 9) +
theme(legend.position = "right",
axis.ticks = element_blank(),
axis.text.x = element_text(angle = 90, hjust = 0)
) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
labs(
title="Ubiquity",
subtitle="Diachronical heatmap of recorded plant species"
) +
scale_fill_gradient(low = "white", high = "black")7.2.1 Ubiquity by macroregion
The heatmap (Figure 7.2) shows the diachronical ubiquity values of the Italian mainland. It is possible however to compare plants’ ubiquity values in different areas of the peninsula. The R function Ubiquity_macroreg_chrono() (Section 2.3) was created to subset data from Northern, Central and Southern Italian regions, using modern boundaries. Archaeobotanical data from Italy is scarce, and subsetting the dataset required a larger chronological division to have enough sites for a valid statistical interpretation of the results. For this reason, the ubiquity values are presented using the variable Chronology rather than subsetting the individual centuries. For a clearer reading of the plot, the taxa have been divided into–Cereals, Pulses and Fruits/Nuts. Some taxa have been omitted from the plot.
7.2.1.1 Cereals
It is interesting to notice how in the Roman age, cereals are similarly ubiquitous in Southern and Northern Italy, although there are some exceptions (i.e. einkorn, rye, oats, and proso millet) that can derive from the randomness of sampling. Unfortunately, only three sites provided botanical samples for Roman Central Italy and their values have then been omitted from the plot. These sites (all from Tuscany) were studied by the Roman Peasant Project (Bowes and University of Pennsylvania, 2020) and only reported three kinds of cereal: common wheat, emmer, and barley. Similar ubiquity values from Northern and Southern Italy in the Roman age may suggest similar production patterns in the whole Italian mainland, even though more data is required. In the Late Roman age, ubiquity data has been calculated for the three macroregions, with Southern Italy being the least trustworthy (five sites in total). Three crops are found on 62.5-75% of the Central Italian sites: common wheat, barley and emmer. Other cereals are present, but less ubiquitously. The cereal triad aforementioned seems to be diffused in the south as well. Conversely, in Northern Italy common wheat and barley were indeed important cultivations but had to compete with other cereals including millet, sorghum, and rye. The latter had a significant increase, being present on almost 30% of the Northern sites (as opposed to the Roman 15%). The Early Medieval age seems to mark a shift in Italian agricultural practices, as cereal ubiquities are much more variable regionally. In Southern Italy, the triad of common wheat, barley and emmer were still the predominant cereals. These cereals are ubiquitous in Central and Northern Italy as well, although these regions adopt polyculture with a diversified number of cereals. The samples from the Medieval age are fewer in number since the upper boundary of this project’s chronology is the 11th c. Despite the short chronology, it is possible to make some considerations. Medieval Centraly Italy relied heavily on common wheat, barley and emmer, with other cereals increasingly important. Barley is the most ubiquitous cereal in Northern Italy in this period, followed by common wheat, millets and sorghum.

7.2.1.2 Legumes
In the Roman Age, pulses are an important part of the diet and are cultivated both in Northern and Southern Italy. In the latter, vetch/broad beans are present in 22-32% of the samples, and lentils are present in 38% of the sites. In the Late Roman Age, broad beans are equally important in Central and Northern Italy, and peas are present in 50% of the Central Italian sites. In the Early Medieval Age, pulses are present in many Central Italian sites, especially blue/red peas, broad beans and other Fabaceae. Lentils and broad beans are also cultivated in almost half of the Northern Italian sites. The importance of pulses in Central Italy is confirmed by the 11th c. samples, where every specie is present in over 66% of the sites and Fabaceae and blue/red peas are found in every sample. Conversely, in Northern Italy broad bean is found in 66% of the sites.

7.2.1.3 Fruits and nuts
Olive and grape are two essential cultivations in the Italian peninsula. Olive pits, as can be expected, are more ubiquitous in Southern Italy, where in Roman times are present in >87% of the sites and in over 58% of the sites in the following chronologies1. Conversely, grape is important in Central and Northern Italy in the Late Roman, Early Medieval and Medieval ages.

8 Site diversity by chronology
The diversity of sites has been calculated using the package vegan and the Shannon equitability (H’) and Simpson indices, both ranging from 0 (for complete unevenness) to 1 (maximum diversity). For the calculation, only sites that provided cereals, legumes, fruits and nuts have been used. Although this choice entails the loss of many observations, it was necessary in order to produce credible results. The graph shows extended Credible levels for groups with fewer observations. In particular, the Shannon equitability index showed higher variability in the 11th century. Consequently, its credible range is very wide.

9 Context type
Important: This section has to be rewritten as now the approach is not frequentist anymore.
It is possible to examine the distribution of plants across different site types during the four phases under examination. During the Roman period, wheat and barley were most prevalent on rural and religious sites, in the latter being used as part of ritual offerings. Minor grains were widely distributed on religious and urban sites, with rural sites and villas also showing high percentages. Legumes were also used in religious offerings but were found on both urban and rural sites. Grapes were ubiquitous across all Roman sites, with particularly high percentages in urban contexts and being extensively cultivated in large Roman estates. Olives had lower percentages, as they cannot be grown in Northern Italy, where most of the samples were collected. However, they were still widely distributed across all contexts, with peak percentages in urban and religious sites. In the Late Roman period, olive presence significantly decreased in urban sites and villas, but remained relatively unchanged in rural contexts. Similarly to olives, grape decrease in the same contexts, but slightly increase in rural sites. In addition to rural sites, grapes are also diffused in over 66% of the fortified sites (castra) that start to appear in this period. It is noteworthy that fortified sites from the Late Roman period exhibit high percentages of both minor and noble grains, possibly being stocked with the addition of pulses such as faba beans and vetch. In the EMA phase, noble/minor grains and pulses are still present in almost any fortified sites, a situation that is similar to the one in rural sites. In rural sites however minor grains (40%) and pulses (32%) are less ubiquitous in this period, although this figure might suffer from sites that only published cereal remains. Grapes are present in every religious/funerary context, followed by fortified (71%) and rural sites (66%). If olive cultivation seem to decrease in this phase, it might be because most of the sites with archaeobotanical remains are located in Northern Italy. Olives are present in 50% of the religious sites and on 36% of villa sites. Olives are absent from fortified sites in all phases, probably as a reflection of the northern location of these sites. Both rural and urban sites increase the consumption of berries in this period (13-14% ca.), which are also used for funerary offerings (14%).
Code
library(rethinking)
library(tidybayes.rethinking)
library(tidybayes)
####################
## NOBLE GRAINS
####################
Noble_Grains <- with(
Df_Cond_Plants.pa,
data.frame(
Chronology = Chronology,
Macroregion = name_macroreg,
Site_Type = as.factor(Type),
Geo = Geo,
Grain_Count = rowSums(Df_Cond_Plants.pa[,c(11,15)])
)
)
#Simplify categories
Noble_Grains$Site_Type <- str_replace(Noble_Grains$Site_Type, "Religious, monastery", "Religious")
Noble_Grains$Site_Type <- str_replace(Noble_Grains$Site_Type, "Castle", "Fortified")
Noble_Grains$Site_Type <- str_replace(Noble_Grains$Site_Type, "Castrum", "Fortified")
# Convert to list
Noble_Grain.list <- list(
NG_Found = Noble_Grains$Grain_Count,
NG_Tot = 2,
TC_ID = as.factor((interaction(Noble_Grains$Chronology,
Noble_Grains$Site_Type)))
)
Noble_Grain.list$TC_ID <- droplevels(Noble_Grain.list$TC_ID)
####################
## MINOR GRAINS
####################
# Convert to list
Minor_Grains_Type.list <- list(
MG_Found = Minor_Grains$Grain_Count,
MG_Tot = 7,
TC_ID = as.factor((interaction(Minor_Grains$Chronology,
Minor_Grains$Site_Type)))
)
Minor_Grains_Type.list$TC_ID <- droplevels(Minor_Grains_Type.list$TC_ID)
####################
## LEGUMES
####################
Legumes <- with(
Df_Cond_Plants.pa,
data.frame(
Chronology = Chronology,
Macroregion = name_macroreg,
Site_Type = as.factor(Type),
Geo = Geo,
Seed_Count = rowSums(Df_Cond_Plants.pa[,c(21,27)]) # Excludes Unsp. Pulses
)
)
#Simplify categories
Legumes$Site_Type <- str_replace(Legumes$Site_Type, "Religious, monastery", "Religious")
Legumes$Site_Type <- str_replace(Legumes$Site_Type, "Castle", "Fortified")
Legumes$Site_Type <- str_replace(Legumes$Site_Type, "Castrum", "Fortified")
# Convert to list
Legumes.list <- list(
Leg_Found = Legumes$Seed_Count,
Leg_Tot = 7,
TC_ID = as.factor((interaction(Legumes$Chronology,
Legumes$Site_Type)))
)
Legumes.list$TC_ID <- droplevels(Legumes.list$TC_ID)
####################
## GRAPE
####################
Grape <- with(
Df_Cond_Plants.pa,
data.frame(
Chronology = Chronology,
Macroregion = name_macroreg,
Site_Type = as.factor(Type),
Geo = Geo,
Present = ifelse(Grape>0, 1, 0)
)
)
#Simplify categories
Grape$Site_Type <- str_replace(Grape$Site_Type, "Religious, monastery", "Religious")
Grape$Site_Type <- str_replace(Grape$Site_Type, "Castle", "Fortified")
Grape$Site_Type <- str_replace(Grape$Site_Type, "Castrum", "Fortified")
# Convert to list
Grape.list <- list(
Grape_Found = Grape$Present,
TC_ID = as.factor((interaction(Legumes$Chronology,
Legumes$Site_Type)))
)
Grape.list$TC_ID <- droplevels(Grape.list$TC_ID)
####################
## OLIVES
####################
Olives <- with(
Df_Cond_Plants.pa,
data.frame(
Chronology = Chronology,
Macroregion = name_macroreg,
Site_Type = as.factor(Type),
Geo = Geo,
Present = ifelse(Olive>0, 1, 0)
)
)
#Simplify categories
Olives$Site_Type <- str_replace(Olives$Site_Type, "Religious, monastery", "Religious")
Olives$Site_Type <- str_replace(Olives$Site_Type, "Castle", "Fortified")
Olives$Site_Type <- str_replace(Olives$Site_Type, "Castrum", "Fortified")
# Convert to list
Olive.list <- list(
Olive_Found = Olives$Present,
TC_ID = as.factor((interaction(Olives$Chronology,
Olives$Site_Type)))
)
Olive.list$TC_ID <- droplevels(Olive.list$TC_ID)
####################
## NUTS
####################
Nuts <- with(
Df_Cond_Plants.pa,
data.frame(
Chronology = Chronology,
Macroregion = name_macroreg,
Site_Type = as.factor(Type),
Geo = Geo,
Seed_Count = rowSums(Df_Cond_Plants.pa[,c(29,31)])
)
)
#Simplify categories
Nuts$Site_Type <- str_replace(Nuts$Site_Type, "Religious, monastery", "Religious")
Nuts$Site_Type <- str_replace(Nuts$Site_Type, "Castle", "Fortified")
Nuts$Site_Type <- str_replace(Nuts$Site_Type, "Castrum", "Fortified")
# Convert to list
Nuts.list <- list(
Nuts_Found = Nuts$Seed_Count,
Nuts_Tot = 3,
TC_ID = as.factor((interaction(Nuts$Chronology,
Nuts$Site_Type)))
)
Nuts.list$TC_ID <- droplevels(Nuts.list$TC_ID)Code
# Eval is set to false because the models have been saved
# Create a binomial model, where N is 2 because that is the total number of noble
# grains studied in this thesis
m_ng_type <- ulam(
alist(
NG_Found ~ dbinom( 2 , p ),
logit(p) <- TypeChr[TC_ID],
TypeChr[TC_ID] ~ dnorm(0,1)
), data=Noble_Grain.list , chains=4
)
saveRDS(m_ng_type, "stan_models/m_ng_type.rds")
# Create a binomial model, where N is 2 because that is the total number of noble
# grains studied in this thesis
m_mg_type <- ulam(
alist(
MG_Found ~ dbinom( 7 , p ),
logit(p) <- TypeChr[TC_ID],
TypeChr[TC_ID] ~ dnorm(0,1)
), data=Minor_Grains_Type.list , chains=4
)
saveRDS(m_mg_type, "stan_models/m_mg_type.rds")
# Create a binomial model, where N is 2 because that is the total number of noble
# grains studied in this thesis
m_leg_type <- ulam(
alist(
Leg_Found ~ dbinom( 7 , p ),
logit(p) <- TypeChr[TC_ID],
TypeChr[TC_ID] ~ dnorm(0,1)
), data=Legumes.list , chains=4
)
saveRDS(m_leg_type, "stan_models/m_leg_type.rds")
# Create a binomial model, where N is 2 because that is the total number of noble
# grains studied in this thesis
m_grape_type <- ulam(
alist(
Grape_Found ~ dbinom( 1 , p ), # Basically a bernoulli dist
logit(p) <- TypeChr[TC_ID],
TypeChr[TC_ID] ~ dnorm(0,2)
), data=Grape.list , chains=4
)
saveRDS(m_grape_type, "stan_models/m_grape_type.rds")
# Create a binomial model, where N is 2 because that is the total number of noble
# grains studied in this thesis
m_olives_type <- ulam(
alist(
Olive_Found ~ dbinom( 1 , p ), # Basically a bernoulli dist
logit(p) <- TypeChr[TC_ID],
TypeChr[TC_ID] ~ dnorm(0,2)
), data=Olive.list , chains=4
)
saveRDS(m_olives_type, "stan_models/m_olives_type.rds")
# NUTS
# Create a binomial model, where N is 3 because that is the total number of nuts
# studied in this thesis
m_nuts_type <- ulam(
alist(
Nuts_Found ~ dbinom( 3 , p ),
logit(p) <- TypeChr[TC_ID],
TypeChr[TC_ID] ~ dnorm(0,1)
), data=Nuts.list , chains=4
)
saveRDS(m_nuts_type, "stan_models/m_nuts_type.rds")
# BERRIES
# Create a binomial model
m_berries_type <- ulam(
alist(
Berries_Found ~ dbinom( 9 , p ),
logit(p) <- TypeChr[TC_ID],
TypeChr[TC_ID] ~ dnorm(0,1)
), data=Berries_type.list , chains=4
)
saveRDS(m_berries_type, "stan_models/m_berries_type.rds")







9.0.1 Richness and diversity








9.0.1.1 Richness in urban contexts
The richness of plants in urban contexts has been calculated for every phase. As not every sample reported cereals, fruits/nuts and legumes, a correction has also been applied to exclude contexts that did not report the three types of plant taxa under examination. This correction has however strong disadvantages as it removes sites where the absence of certain plant categories may have been deliberate (e.g., storages). However, if the figures vary considerably (especially in the case of the 11th century CE which only consists of 2 samples after the correction), the general trend is similar after the correction. Plant richness in urban contexts slightly decreases after the Roman age, but early medieval cities are richer in plants when compared to the Roman and Late Roman phases. In the 11th century, plant richness increases further. These values will be discussed in the results chapter.
| Chronology | Richness | Samples | Richness* | Samples* |
|---|---|---|---|---|
| R | 6.5 | 44 | 8.45 | 22 |
| LR | 5.9 | 24 | 7.2 | 10 |
| EMA | 7.2 | 25 | 9.6 | 8 |
| Ma | 9 | 8 | 17.5 | 2 |
9.1 Soil types vs Crops
see Rippon et al 2015, p. 81 as a comparison for results and Lodwick chapter in Hamerow and McKerracher 2022, p. 153
10 Macroregion
10.1 Cereals
10.1.1 Richness and diversity
Cereals share similar presence values in Roman Northern and Southern Italian sites (Figure 10.1). Central Italy reports higher values, although this is based only on three sites and hence it is not reliable. During the Early Middle Ages, Central Italy again is the richest in cereals, closely followed by Northern Italy. Interestingly, Southern Italy still reports values very close to the Roman age. A full list of the Southern Italian EMA sites is reported in Table 10.1.


| ID | Site | Region | Geography | Type | Culture/Influence |
|---|---|---|---|---|---|
| 98 | S. Maria in Cività, D85 | Molise | Hilltop | Urban | Lombard |
| 107 | S. Giovanni di Ruoti, Phase 3A | Basilicata | Mountain | Monastery | Lombard |
| 107 | S. Giovanni di Ruoti, Phase 3B | Basilicata | Mountain | Monastery | Lombard |
| 198 | Salapia, area botteghe, US 2475 | Puglia | Coast/Lagoon | Urban | Lombard |
| 198 | Salapia, area botteghe, US 2437 | Puglia | Coast/Lagoon | Urban | Lombard |
| 199 | Salapia, area conceria, US 2054 | Puglia | Coast/Lagoon | Urban | Lombard |
| 199 | Salapia, area conceria, US 2211-2217 | Puglia | Coast/Lagoon | Urban | Lombard |
| 199 | Salapia, area conceria, 8th-9th c. | Puglia | Coast/Lagoon | Urban | Lombard |
| 196 | Faragola, wastepit 61 | Puglia | Plain | Rural, villa | Lombard |
| 196 | Faragola, wastepit 66 | Puglia | Plain | Rural, villa | Lombard |
| 234 | Colle Castellano, Phase 3-4 | Molise | Hill | Urban | Lombard |
| 177 | San Vincenzo al Volturno, kitchen area | Molise | Hill | Monastery | Lombard |
| 101 | Supersano, loc. Scorpo | Puglia | Plain | Rural | Byzantine |
| 250 | Apigliano, 9th-10th c., pits | Puglia | Plain | Rural | Byzantine |
| 250 | Apigliano, 10th-11th c., pits | Puglia | Plain | Rural | Byzantine |
| 196 | Faragola, granary A7 | Puglia | Plain | Rural, villa | Lombard |
| 196 | Faragola, granary A8 | Puglia | Plain | Rural, villa | Lombard |
10.1.2 Assessing the difference
Permutational multivariate analysis of variance (PERMANOVA) is a non-parametric multivariate statistical test used to compare group of objects. By using measure space, the null hypothesis that the centroids and dispersion of groups are identical is tested. The null hypothesis is rejected if either the centroid or the spread of the objects differs between the groups. A prior calculation of the distance between any two objects included in the experiment is used to determine whether the test is valid or not2 (Anderson, 2017). In this context, the null hypothesis is that there is no regional difference in the cereals dataset, with cereals being evenly distributed across macroregions and chronologies.
The suggestion of an Early Medieval shift in cereal farming stated in Section 7.2.1 and Section 10.1.1 needs statistical support. Considering that data is not unimodal and that we are dealing with presence/absence analysis, the best choice is to use a non-parametric test as PERMANOVA on the early medieval botanical dataset. Prior to performing the test, it was necessary to pre-process data by:
Selecting all the cereals columns of the plant remains table, keeping the categorical variables:
Macroregion.Removing empty rows (samples that only reported seeds/fruits, but not cereals).
Transforming the raw counts into presence/absence, using the function
decostand()(method=pa) in theRpackagevegan(Oksanen et al., 2020).
Show the code: Pre-processing
# Testing the results: Regionality in the dataset?
library(vegan)
set.seed(29)
# Pre processing: remove empty rows
# Note: The input table is the CONDENSED table without totals
# Selecting all the cereals columns of the plant remains table, keeping some categorical variables
cer_macroreg_ubiquity_transp.tot <- Df_Cond_Plants[c(4,5,6,7,11:19)]
# Selecting all rows with data (since we selected only with cereals, and some sites only had fruits/pulses we might have empty rows)
cer_macroreg_ubiquity_transp.tot <- cer_macroreg_ubiquity_transp.tot[rowSums(cer_macroreg_ubiquity_transp.tot[5:13])>0,]
# Assigning a column name "Macroregion"
colnames(cer_macroreg_ubiquity_transp.tot)[1] = "Macroregion"
# Selecting the Chronology of interest (EMA) and excluding Central Italy
cer_macroreg_ubiquity_transp.tot<- filter(cer_macroreg_ubiquity_transp.tot, Macroregion!="Central Italy" & Chronology=="EMA")
# dividing categorical and numerical columns
cer_macroreg_ubiquity_transp.categ <- cer_macroreg_ubiquity_transp.tot[1:4]
cer_macroreg_ubiquity_transp.data <- cer_macroreg_ubiquity_transp.tot[5:13]
# Converting the numerical columns into a presence/absence matrix (using method=pa)
cer_macroreg_ubiquity_transp.dist <- decostand(cer_macroreg_ubiquity_transp.data, method="pa", na.rm=TRUE)After the pre-processing, it was possible to run the PERMANOVA using the function adonis2() in the package vegan. The function creates a distance matrix and computes an analysis of variance on the matrix. The method chosen to calculate the distance matrix is the jaccard distance. The Jaccard distance, based on the Jaccard similarity index, is a value of dissimilarity between sample sets (Kosub, 2019). When compared to other dissimilarity indices, it is more appropriate for presence/absence analyses as it is not based on Euclidean distance.
Results of adonis2().
Show the code: adonis2()
cer_macroreg_ubiquity_transp.div <- adonis2(
cer_macroreg_ubiquity_transp.dist ~ Macroregion,
data = cer_macroreg_ubiquity_transp.categ,
permutations = 10000, method="jaccard"
)Permutation test for adonis under reduced model
Terms added sequentially (first to last)
Permutation: free
Number of permutations: 10000
adonis2(formula = cer_macroreg_ubiquity_transp.dist ~ Macroregion, data = cer_macroreg_ubiquity_transp.categ, permutations = 10000, method = "jaccard")
Df SumOfSqs R2 F Pr(>F)
Macroregion 1 1.2761 0.12134 7.3192 9.999e-05 ***
Residual 53 9.2407 0.87866
Total 54 10.5169 1.00000
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The calculation of PERMANOVA on the Roman dataset using the variable Macroregion reported a not statistically significant p-value, suggesting a degree of homogeneity. On the other hand, when applied to the early Medieval dataset, the results were highly significant (0<p<0.001) with a 99.99% confidence. PERMANOVA has not been calculated on the Late Roman dataset as samples from Southern Italy were too scarce. To confirm the significance of the results of PERMANOVA applied to the EMA dataset, it is necessary to check that its assumptions are met (especially since we are dealing with small groups of data). Firstly, we check the homogeneity of variances. The function betadisper() from the package vegan provides the distances of group samples from centroids. If the variation is even, the null hypothesis of no difference in dispersion between groups is accepted. To test the variation, it is possible to use the analysis of variance (ANOVA).
Show the code: betadisper()
# We do not need to calculate the distance separately, but it will be useful later for the betadisper() function
# Distance dissimilarity matrix with the Jaccard method
cer_macroreg_ubiquity_transp.dist2 <- vegdist(cer_macroreg_ubiquity_transp.dist, method="jaccard", na.rm=TRUE)
# Betadisper: distances of group samples from centroids
cer_macroreg_ubiquity_transp.betadisper <- betadisper(cer_macroreg_ubiquity_transp.dist2, cer_macroreg_ubiquity_transp.categ$Macroregion)Results of anova() on the betadisper.
Show the code: ANOVA on betadisper()
# We will see that the ANOVA's p-value is not significant meaning that group dispersions are homogenous
#("Null hypothesis of no difference in dispersion between groups"; https://www.rdocumentation.org/packages/vegan/versions/2.4-2/topics/betadisper).
anova(cer_macroreg_ubiquity_transp.betadisper) # This should not be significant!Analysis of Variance Table
Response: Distances
Df Sum Sq Mean Sq F value Pr(>F)
Groups 1 0.00379 0.0037868 0.1262 0.7238
Residuals 53 1.59020 0.0300038


The betadisper() graphs (Figure 10.2) show similar distances from the centroids for the categories Northern Italy and Southern Italy. In addition, the ANOVA on the betadisper() shows that the separation is not significant (p-value over the significance threshold), meaning that the groups dispersions are homogeneous. We can thus be confident in the PERMANOVA results and accept the difference between the two groups of sites under investigation. In other words, the Southern and Northern Italian group of sites are different during the Early Middle Ages for what concerns cereal farming.
Running the same tests on the Roman sites failed to separate the two groups of sites, confirming that there was not a significant difference in the types of cereals cultivated during the Roman age between Northern and Southern Italy.
10.1.3 Quantifying the separation: nMDS
The Wasserstein distance (or earth’s mover distance) is a measure of distance between two probability distributions on a metric space.
In addition to statistically testing the separation between the Northern and Southern Italian early medieval cereals dataset (Section 10.1.2), it is possible to measure the distance between groups of sites both in the Roman and early Middle ages. For this task, a dimensionality reduction algorithm has been chosen: the non-metric multidimensional scaling, from the R library vegan. A more in-depth explanation of the algorithm can be read in Section 6.6.2.3. To avoid fallacy in computations, the macroregion Central Italy and the chronologies LR (Late Roman) and Ma (11th c. onwards) have been excluded from this test—the uneven distribution of the group of samples required a cautious approach. The nMDS has been run with a reduction to only one dimension, using KDE plots to visualize the results. Setting the dimension to one allows easier calculations of distance. In Figure 10.3 (a), it is possible to see the nMDS performed on the Roman cereals presence/absence dataset. As already pointed out, the PERMANOVA did not produce significant results for this dataset and the Wasserstein distance (calculated with the wasserstein1d() function in the transport library) is indeed shorter for the Roman dataset. For both chronologies there is an overlap in the curves, which is more considerable in the Roman age (indicating that the group of samples are more similar). The overlap for the EMA groups (Figure 10.3, b) is probably due to the fact that the presence of the noble grains is not by itself a ‘marker’ of Southern Italian sites—these grains are also very common in the North. The difference is that in the South noble grains are not cultivated in conjunction with other grains. The graph for the EMA chronology shows a clearer separation of the macroregional groups, with some minor overlaps. Moreover, the graph also displays variability in the Northern Italian dataset. The variability can also be assessed from the outliers in the boxplots in Figure 10.4.
Source code:
Show the code: Data preparation
# DATA PREPARATION #
# Creating a simple dataframe just for the purpose of the NCA
Df_Cond_Plants_simplified <- data.frame(
"Chronology" = Df_Cond_Plants$Chronology,
"Type" = Df_Cond_Plants$Type,
"Macroregion" = Df_Cond_Plants$name_macroreg,
"Weight" = Df_Cond_Plants$weight,
Df_Cond_Plants[11:ncol(Df_Cond_Plants)]
)
# Joining site typology categories to simplify the output, using library stringr
library(vegan)
library(stringr)
Df_Cond_Plants_simplified$Type <- str_replace_all(
Df_Cond_Plants_simplified$Type, c(
"Rural site, villa" = "Rural",
"Rural site, mansio" = "Rural",
"Religious, monastery" = "Religious",
"Castle" = "Fortified",
"Castrum" = "Fortified")
)
# Convert to Presence (1) / Absence (0) using library vegan
Df_Cond_Plants_simplified[5:ncol(Df_Cond_Plants_simplified)] <- decostand(Df_Cond_Plants_simplified[5:ncol(Df_Cond_Plants_simplified)], method="pa")
# Creating dataframes for the chronologies R and EMA
Df_Cereals_R <- filter(Df_Cond_Plants_simplified, Chronology=="R" & Macroregion!="Central Italy")
Df_Cereals_EMA <- filter(Df_Cond_Plants_simplified, Chronology=="EMA" & Macroregion!="Central Italy")
# Selecting only cereals from both dataframes
Df_Cereals_R <- Df_Cereals_R[1:14]
Df_Cereals_R[is.na(Df_Cereals_R)] <- 0
Df_Cereals_R <- Df_Cereals_R[rowSums(Df_Cereals_R[,5:14] != 0) > 0, ] # Removing empty rows
Df_Cereals_EMA <- Df_Cereals_EMA[1:14]
Df_Cereals_EMA[is.na(Df_Cereals_EMA)] <- 0
Df_Cereals_EMA <- Df_Cereals_EMA[rowSums(Df_Cereals_EMA[,5:14] != 0) > 0, ]



10.1.4 Noble grains

10.1.5 Minor grains

10.2 Legumes

10.3 Fruits
10.3.1 Grape

10.3.2 Olives

10.3.3 Nuts

10.3.4 Domestic and wild fruits


11 Altitude
The creation of altitude models was not possible for every plant investigated in this study. As the process is very time consuming, it was necessary to select only plants that are most likely to be either positively or negatively impacted by altitude, or taxa that are cultivated for economic reasons. Primary importance was thus given to cereals. Since not every cereal can be cultivated at high altitudes, it was more appropriate to model them individually instead of grouping them into noble and minor grains.
11.0.1 Model specification and priors
\[ P_{i} \sim Bernoulli(\bar{p}_{i}) \]
\[ logit(\bar{p}_{i}) = \alpha_{[ChrID]} + \beta_{[ChrID]}\cdot Alt_{i} \]
\[ \alpha_{ChrID} \sim Normal(0,1.5) \]
\[ \beta_{ChrID} \sim Normal(0,1.5) \]
\[ \phi \sim Exponential(1) \]

11.1 Cereals







11.1.0.1 Community plot

11.2 Legumes

11.3 Fruits
11.3.1 Grape

11.3.2 Olives

12 Model comparisons
12.1 Precipitation
12.1.1 Model specification and priors
12.1.1.1 Model 1: Precipitation only
The following model uses standardized mean annual precipitation values as a coefficient for the slope and weakly informative priors for both the intercept and the slope.
\[ P_{i} \sim Bernoulli(\bar{p}_{i}) \]
\[ logit(\bar{p}_{i}) = \alpha_{[ChrID]} + bPrecip_{[ChrID]}\cdot MeanAnnualPrecip \]
\[ \alpha_{ChrID} \sim Normal(0,1.3) \]
\[ bPrecip_{ChrID} \sim Normal(0,1.3) \]
\[ \phi \sim Exponential(1) \]
12.1.1.2 Model 2: Altitude only
The following model uses altitude values (in km) as a coefficient for the slope and weakly informative priors for both the intercept and the slope. It is the same model presented before.
\[ P_{i} \sim Bernoulli(\bar{p}_{i}) \]
\[ logit(\bar{p}_{i}) = \alpha_{[ChrID]} + bAlt_{[ChrID]}\cdot Altitude \]
\[ \alpha_{ChrID} \sim Normal(0,1.3) \]
\[ bAlt_{ChrID} \sim Normal(0,1.3) \]
\[ \phi \sim Exponential(1) \]
12.1.1.3 Model 3: Altitude and Precipitation
The following model uses altitude (in km) and standardized mean annual precipitation values as coefficients for the slope, and weakly informative priors for both the intercept and the slope.
\[ P_{i} \sim Bernoulli(\bar{p}_{i}) \]
\[ logit(\bar{p}_{i}) = \alpha_{[ChrID]} + bAlt_{[ChrID]}\cdot Altitude + bPrecip_{[ChrID]}\cdot MeanAnnualPrecip \]
\[ \alpha_{ChrID} \sim Normal(0,1.3) \]
\[ bAlt_{ChrID} \sim Normal(0,1.3) \]
\[ bPrecip_{ChrID} \sim Normal(0,1.3) \]
\[ \phi \sim Exponential(1) \]
12.1.1.4 Prior predictive simulations
We also want to make sure that the priors are weakly informative and are not deeply impacting the models.


12.1.2 Performance and comparisons
The comparison of the three models using the WAIC measure does not show a significant difference between the three models. Although for simplicity only the common wheat models are shown, repeated calculation for several other plants (including non cereal plants) did not show a significant impact of adding mean annual precipitation to the model.
WAIC SE dWAIC dSE pWAIC weight
m_Cwheat_precip 274.7821 12.76496 0.000000 NA 7.086813 0.76350062
m_CWheat_precip_alt 277.9369 13.14621 3.154749 1.899620 9.125909 0.15767553
m_CWheat_altitude 279.3235 13.53368 4.541397 5.226394 6.046197 0.07882385


12.2 Temperature
12.2.1 Model specification and priors
12.2.1.1 Model 1: Temperature only
The following model uses mean annual temperature values as a coefficient for the slope and weakly informative priors for both the intercept and the slope.
\[ P_{i} \sim Bernoulli(\bar{p}_{i}) \]
\[ logit(\bar{p}_{i}) = \alpha_{[ChrID]} + bTemp_{[ChrID]}\cdot MeanAnnualTemp \]
\[ \alpha_{ChrID} \sim Normal(0,1.3) \]
\[ bTemp_{ChrID} \sim Normal(0,1.3) \]
\[ \phi \sim Exponential(1) \]
12.2.1.2 Model 2: Altitude only
The following model uses altitude values (in km) as a coefficient for the slope and weakly informative priors for both the intercept and the slope. It is the same model presented before.
\[ P_{i} \sim Bernoulli(\bar{p}_{i}) \]
\[ logit(\bar{p}_{i}) = \alpha_{[ChrID]} + bAlt_{[ChrID]}\cdot Altitude \]
\[ \alpha_{ChrID} \sim Normal(0,1.3) \]
\[ bAlt_{ChrID} \sim Normal(0,1.3) \]
\[ \phi \sim Exponential(1) \]
12.2.1.3 Model 3: Altitude and Temperature
The following model uses altitude (in km) and standardized mean annual temperature values as coefficients for the slope, and weakly informative priors for both the intercept and the slope.
\[ P_{i} \sim Bernoulli(\bar{p}_{i}) \]
\[ logit(\bar{p}_{i}) = \alpha_{[ChrID]} + bAlt_{[ChrID]}\cdot Altitude + bTemp_{[ChrID]}\cdot MeanAnnualTemp \]
\[ \alpha_{ChrID} \sim Normal(0,1.3) \]
\[ bAlt_{ChrID} \sim Normal(0,1.3) \]
\[ bTemp_{ChrID} \sim Normal(0,1.3) \]
\[ \phi \sim Exponential(1) \]
12.2.1.4 Prior predictive simulations
We also want to make sure that the priors are weakly informative and are not deeply impacting the models.


12.2.2 Performance and comparisons
The creation of models for each plant is extremely time consuming, and for the purposes of this comparison I chose to compare only taxa that are known to be more sensitive to temperature and that have economic implications.
12.2.2.1 Common wheat
The comparison of the three models using the WAIC, although showing a slightly larger distance as opposed to the precipitation models, is still not significant enough to justify the introduction of another predictor in the model.
WAIC SE dWAIC dSE pWAIC
m_Cwheat_temperature 278.6323 14.34055 0.0000000 NA 7.874550
m_CWheat_altitude 279.3235 13.53368 0.6912639 4.2142679 6.046197
m_CWheat_temperature_alt 281.4353 14.70436 2.8030424 0.8399438 9.507590
weight
m_Cwheat_temperature 0.5117720
m_CWheat_altitude 0.3622184
m_CWheat_temperature_alt 0.1260096


12.2.2.2 Rye
The comparison of the three models using the WAIC is not significant enough to justify the introduction of another predictor in the model.
WAIC SE dWAIC dSE pWAIC
m_Rye_temperature 229.9209 18.08467 0.000000 NA 8.623651
m_Rye_temperature_alt 232.3080 18.31918 2.387091 1.072195 10.211306
m_Rye_altitude 240.4145 16.09612 10.493538 9.147829 5.985891
weight
m_Rye_temperature 0.764286992
m_Rye_temperature_alt 0.231689419
m_Rye_altitude 0.004023589


12.2.2.3 Legumes
The comparison of the three models using the WAIC is not significant enough to justify the introduction of another predictor in the model.
WAIC SE dWAIC dSE pWAIC
m_Legumes_altitude 890.8435 35.15337 0.000000 NA 14.51364
m_Legumes_temperature 894.4890 35.14409 3.645444 7.649821 16.00504
m_Legumes_temperature_alt 897.6609 35.27224 6.817366 6.403353 22.00542
weight
m_Legumes_altitude 0.83705120
m_Legumes_temperature 0.13525518
m_Legumes_temperature_alt 0.02769362


12.2.2.4 Grape
The combination of temperature and altitude as predictors for grape occurrence did not produce any strong enough difference that would favour the use of a predictor in place of the other.
WAIC SE dWAIC dSE pWAIC weight
m_grape_altitude 272.7094 11.88115 0.000000 NA 5.371696 0.70273289
m_Grape_temperature_alt 274.5742 13.48959 1.864836 5.361352 7.709453 0.27659630
m_Grape_temperature 279.7619 13.69224 7.052509 7.102870 6.392521 0.02067081


12.2.2.5 Olive
If the model is applied to other types of plants, the situation changes. In the case of olives the prediction accuracy increases when the mean annual temperature and the altitudes are both slope coefficients.
WAIC SE dWAIC dSE pWAIC
m_Olive_temperature_alt 225.6696 14.97050 0.000000 NA 6.386533
m_Olive_temperature 226.7000 15.39914 1.030409 1.288141 5.801423
m_Olive_altitude 260.5728 12.54551 34.903251 10.757408 4.181470
weight
m_Olive_temperature_alt 6.260257e-01
m_Olive_temperature 3.739743e-01
m_Olive_altitude 1.649862e-08


Given that the model which used temperature and altitude performed better than the individual slope models, it is worth it to look at the predicted olive occurrence probability plotted against:
the standardised mean annual temperatures.
elevation (km).
The chronological differences seem to be consistent until the 11th century. Sites with higher temperatures have a higher likelihood of consuming/producing olives. Likewise, sites with higher altitudes have a reduced probability of consuming or producing the same plant.


13 To do:
13.1 Wine/Olive oil production in Italy
Questions: Is there a decrease in the production of cash crops in Late Roman villas? In particular in central Italian villas.
Available data:
Seed remains
Wine presses in Roman Italy (1st to 6th c. ca) -> Dodd
Wine/Olive presses, dated but no info on wine/olive or size: http://oxrep.classics.ox.ac.uk/databases/olive_oil_and_wine_presses_database/
Dodd: “a notable increase in size from around the first century BCE.” No quantitative data provided.
13.2 Fruit consumption
The domesticated and the wild
What fruits arrive on the Roman table?
Can we see a decreased consumption of figs during the LALIA in the north?